Abstract. We present a new method for expressing Chaitin's random real, Ω,
through Diophantine equations. Where Chaitin's method causes a particular
quantity to express the bits of Ω by fluctuating between finite and infinite
values, in our method this quantity is always finite and the bits of Ω are
expressed in its fluctuations between odd and even values, allowing for some
interesting developments. We then use exponential Diophantine equations to
simplify this result and finally show how both methods can also be used to
create polynomials which express the bits of Ω in the number of positive values
they assume.
1. Recursive Enumerability, Algorithmic Randomness and Ω

One of the most startling recent developments in the theory of computation is the
discovery of the number Ω, through the subfield of algorithmic information theory.
Ω is a real number between 0 and 1 which was introduced by G. J. Chaitin
as an example of a number with two conflicting properties: it is both recursively
enumerable and algorithmically random. Very roughly, this means that Ω has a
simple definition and can be computed in the limit from below, yet we can determine
only finitely many of its digits with certainty; for the rest we can do no better than
random.

Understanding the full importance of these properties requires some familiarity 
with the recursive functions — commonly presented through models of computation 
such as Turing machines or the lambda calculus. For the purposes of algorithmic 
information theory, however, it is convenient to abstract some of the details from 
these models and consider a programming language in which the (partial) recursive 
functions are represented by finite binary strings.^1 These strings are just programs
for a universal Turing machine (or universal lambda expression) and they take input 
in the form of a binary string then output another binary string or diverge (fail to 
halt). For convenience, we will often consider these inputs and outputs to encode 
tuples of positive integers. 

^1 For more details see Chaitin

On top of this simplified picture of computation, we impose one restriction which
is necessary for the development of algorithmic information theory (and hence Ω).
The set of strings that encode the recursive functions must be prefix-free. This
means that no program can be an extension of another, and thus each program is
said to be self-delimiting. As algorithmic information theory is intricately linked
with communication as well as computation, this is quite a natural constraint:
if you wish to use a permanent binary communication channel, then you need to
know when the end of a message has been reached and this cannot be done if some
messages are extensions of others.
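
A minimal sketch of the prefix-free condition, using small hypothetical sets of bitstrings rather than programs from any actual language. The names `is_prefix_free` and `kraft_sum` are illustrative assumptions. The second function evaluates the sum of 2^-|p| over the set, which cannot exceed 1 when the set is prefix-free (Kraft's inequality).

```python
from fractions import Fraction

def is_prefix_free(strings):
    """Return True if no string in the collection is a proper prefix of another."""
    ordered = sorted(set(strings))
    # In sorted order, a string that is a prefix of some other string is
    # immediately followed by a string that extends it, so checking
    # adjacent pairs is enough.
    return all(not b.startswith(a) for a, b in zip(ordered, ordered[1:]))

def kraft_sum(strings):
    """Sum of 2^-|p| over the set; at most 1 whenever the set is prefix-free."""
    return sum(Fraction(1, 2 ** len(s)) for s in set(strings))

toy_language = ["0", "10", "110", "1110"]   # hypothetical self-delimiting codes
not_prefix_free = ["0", "01", "11"]         # "0" is a prefix of "01"

print(is_prefix_free(toy_language), kraft_sum(toy_language))
print(is_prefix_free(not_prefix_free))
```

A receiver reading `toy_language` messages bit by bit always knows when a message has ended, which is exactly the self-delimiting property described above.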

There are many prefix-free sets that one could choose and many recursive mappings
between these and the recursive functions. These different choices of 'programming
language' lead to different values of Ω, but this does not matter much
as almost all of its significant properties will remain the same regardless. However,
to allow talk of Ω as a specific real number we will use the same language as
Chaitin Ej.

Now that we have explained what we mean by a programming language, we can
give a quick overview of computability in terms of programs. A program computes
a set of n-tuples if, when provided with input (x_1, ..., x_n), it returns 1 if this is
a member of the set and 0 otherwise. A program computes an infinite sequence
if, when provided with input n, it returns the value of the n-th element in the
sequence. A program computes a real, r, if it computes a sequence of rationals
{r_n} which converges to r with an explicit bound on the error |r - r_n| for each n.
These sets, sequences and reals that are computed by programs are said to be recursive.

There are also many sets, sequences and reals that cannot be computed, but can
be approximated in an important way. A program semi-computes a set of n-tuples
if, when provided with input (x_1, ..., x_n), it returns 1 if this is a member of the
set and diverges otherwise. A program semi-computes an infinite sequence of bits
if, when provided with input n, it returns 1 if the n-th bit in the sequence is 1
and diverges otherwise. A program semi-computes a real, r, if, when provided with
input n, it computes a rational number, r_n, where {r_n} converges to r from below.
These sets, infinite bitstrings and reals that are semi-computed by programs are
said to be recursively enumerable or r.e.
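
The difference between computing and semi-computing a set can be sketched as follows. A test cannot wait for a program that diverges, so the sketch assumes an optional `max_steps` cutoff that returns `None` (meaning 'no answer yet'); with the cutoff removed, the search on a non-member would run forever. The enumerator `squares` is only a stand-in for an arbitrary r.e. set (the set of squares is in fact recursive), and all names are hypothetical.

```python
from itertools import count

def decide_multiple_of_three(x):
    """Computes the set of multiples of 3: always answers 1 or 0."""
    return 1 if x % 3 == 0 else 0

def squares():
    """Enumerate 1, 4, 9, ... as a stand-in for an arbitrary r.e. set."""
    for n in count(1):
        yield n * n

def semi_decide(x, enumerate_set, max_steps=None):
    """Semi-computes membership: returns 1 once x has been enumerated.

    With max_steps=None the loop never ends for a non-member (divergence).
    A finite max_steps is an artificial cutoff that returns None instead.
    """
    for step, member in enumerate(enumerate_set(), start=1):
        if member == x:
            return 1
        if max_steps is not None and step >= max_steps:
            return None

print(decide_multiple_of_three(9), decide_multiple_of_three(10))
print(semi_decide(49, squares, max_steps=100))
print(semi_decide(50, squares, max_steps=100))
```

The cutoff answer `None` is deliberately different from 0: a semi-computing program never asserts non-membership.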

There is an important point that needs to be made concerning reals and their
representations. Each real number between 0 and 1 has a binary expansion: a binary
point followed by an infinite sequence of bits that represents the real.^2 Throughout
this paper, we shall be making considerable use of the binary expansions of real 
numbers so it is important to point out an oddity in the definitions above: a real is 
recursive if and only if its binary expansion is recursive, but a real may be r.e. even 
if its binary expansion is not r.e. We shall thus take care to distinguish the weaker 
property of being an r.e. real from the stronger one of being a real whose binary 
expansion is r.e. 

^2 For numbers that can be expressed with a representation ending in an infinite string of 0's,
there is another representation ending in an infinite sequence of 1's, but we shall remove this
ambiguity by only using representations with an infinite number of 0's. This will not affect the
important reals in this paper, Ω and τ, as they are irrational and thus have unique representations
regardless.

An example of a real that is r.e. but not recursive is τ: the real number between
0 and 1 whose k-th digit is 1 if the k-th program (in the usual lexical ordering
of finite bitstrings) halts when given the empty string as input and 0 if the k-th
program diverges. Equivalently:

(1.1) \tau = \sum_{p_n \text{ halts}} 2^{-n}

τ is an r.e. real because there is a computable sequence of rationals {τ_i}, where

(1.2) \tau_i = \sum_{p_n \text{ halts in } \le i \text{ steps}} 2^{-n}

such that {τ_i} converges to τ from below.
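
To make (1.2) concrete, here is a toy calculation of the approximations from below. The table of halting times is entirely hypothetical: it stands in for the behaviour of the first five programs, which cannot be tabulated in general.

```python
from fractions import Fraction

# Hypothetical data: program index n -> number of steps before p_n halts,
# or None if p_n diverges.
HALTING_TIME = {1: 3, 2: None, 3: 1, 4: 7, 5: None}

def tau_approx(i):
    """tau_i: sum of 2^-n over the toy programs p_n that halt within i steps."""
    return sum(
        (Fraction(1, 2 ** n) for n, t in HALTING_TIME.items()
         if t is not None and t <= i),
        Fraction(0),
    )

for i in (0, 1, 3, 7):
    print(i, tau_approx(i))
```

Each approximation only ever gains terms as i grows, which is why the sequence approaches its limit from below and never overshoots.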

Furthermore, it is clear that the binary representation of τ is also r.e. because
there is a program that simulates the k-th program, halting if and only if it does.
This program is a slightly modified universal program that first determines the bits
of the k-th program and then simulates it.

τ is not recursive, however, because if a program could compute it to arbitrary
accuracy, it would determine whether each program halts or not when given the
empty string as input. This is known as the blank tape problem and is easily shown
to be equivalent to the more general halting problem: 'does a given program halt on
a given input?'. The halting problem is fundamental to the theory of computation
and is the most famous problem that cannot be recursively solved. τ merely encodes
the information necessary to solve the halting problem into the binary expansion
of a real number and thus provides a very simple example of a non-computable real
against which we can contrast the more exotic properties possessed by Ω.

Ω encodes the halting problem in a more subtle way: it is the halting probability.
We could, theoretically, generate a random program one bit at a time, by flipping a
fair coin and writing down a 1 when it comes up heads and a 0 for tails, stopping
if we reach a valid program. The chance of generating any given n-bit program is
therefore 2^{-n}. Ω is the chance that this method of random program construction
generates a program that halts. Letting |p| represent the size of p in bits, we can
also express Ω as

(1.3) \Omega = \sum_{p \text{ halts}} 2^{-|p|}

As was the case for τ, there is a computable sequence of rationals {Ω_i}, where

(1.4) \Omega_i = \sum_{p \text{ halts in } \le i \text{ steps}} 2^{-|p|}

which converges to Ω from below, showing it to be an r.e. real. However, we shall
see shortly that the binary representation of Ω is not r.e.
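
The two readings of the halting probability (a sum of 2^-|p| over halting programs, and the outcome of a coin-flipping process) can be checked against each other on a toy example. The four-program language and its halting times below are hypothetical. Because the toy language is finite, its halting probability is a rational number, unlike the real Ω.

```python
from fractions import Fraction
from itertools import product

# Hypothetical prefix-free language: program -> halting time (None = diverges).
TOY_PROGRAMS = {"0": 2, "10": None, "110": 5, "111": 1}

def omega_approx(i):
    """Omega_i: sum of 2^-|p| over toy programs that halt within i steps."""
    return sum(
        (Fraction(1, 2 ** len(p)) for p, t in TOY_PROGRAMS.items()
         if t is not None and t <= i),
        Fraction(0),
    )

def halting_probability_by_coin_flips(max_len=3):
    """Add up the probability of every sequence of max_len fair coin flips
    whose initial program halts. max_len must be at least the length of the
    longest toy program."""
    total = Fraction(0)
    for flips in product("01", repeat=max_len):
        s = "".join(flips)
        # The language is prefix-free, so at most one program is a prefix of s.
        for p, t in TOY_PROGRAMS.items():
            if s.startswith(p) and t is not None:
                total += Fraction(1, 2 ** max_len)
    return total

print([omega_approx(i) for i in (1, 2, 5)])
print(halting_probability_by_coin_flips())
```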

A real is said to be algorithmically random^3 if and only if the 'algorithmic
complexity' of each n-bit initial segment of its binary expansion becomes and remains
arbitrarily greater than n. In other words a real, r, is algorithmically random if
and only if any program that has access to outside advice in the form of binary
messages requires more than n bits of advice to compute the first n bits of r's binary
expansion (for all values of n above some threshold).^4 Thus a random real is one
for which only finitely many prefixes of its binary expansion can be compressed.

^3 This is only one of four common definitions of algorithmic randomness; however, all have been
shown to be equivalent.

^4 The reason that slightly more than n bits of advice are needed is because in algorithmic
information theory the advice comes in self-delimiting messages (which are actually programs
that generate the advice, like self-extracting archives) and in order to be self-delimiting, these
messages need slightly more bits than they would otherwise. In general, an n-bit string requires
about (n + log n) bits. Chaitin p] provides further details.
It is easy to see that a random real cannot have an r.e. binary expansion. Let x
be an arbitrary real whose binary expansion is r.e. By definition, there must be a
program, p_x, that takes a positive integer, k, and halts if and only if the k-th bit of
x is 1. To determine the first n bits of x, we just need to know how many of these n values
of k make p_x halt. We could then simply run p_x on all the values of k in parallel and stop
when this many have halted, knowing that no more will halt and thus determining
the n bits of x. Since all positive integers less than n can be encoded in log n bits
(rounding up), we only need to send a message of about (log n + log log n) bits.
In this manner, any prefix of x can be significantly compressed, so x cannot be
random.
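
The decoding step of this argument can be sketched as below. The function `halts_within(k, steps)` is an assumed step-bounded simulator for the semi-deciding program, and `toy_halts_within` is a hypothetical stand-in for it. Given only the number of 1 bits among the first n, the decoder runs all n simulations side by side until that many have halted. If the supplied count were too large, the loop would never terminate.

```python
def first_bits_from_count(halts_within, n, count):
    """Recover bits 1..n of x from the number of those bits that equal 1.

    halts_within(k, steps) must return True when the semi-deciding program
    halts on input k within the given number of steps.
    """
    steps = 0
    halted = set()
    while len(halted) < count:
        steps += 1
        # Advance every unfinished simulation by one more step.
        for k in range(1, n + 1):
            if k not in halted and halts_within(k, steps):
                halted.add(k)
    # Everything that has not halted by now never will.
    return [1 if k in halted else 0 for k in range(1, n + 1)]

def toy_halts_within(k, steps):
    """Hypothetical semi-decider: halts on even k after k*k steps."""
    return k % 2 == 0 and steps >= k * k

print(first_bits_from_count(toy_halts_within, 6, 3))
```

The only advice the decoder needs is the integer `count`, which is why the message is so much shorter than n bits.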

Because of this, we can see that τ too is not random. However, Chaitin |H] has
proven that Ω is random and so cannot be compressed in this manner.^5 For sufficiently
high values of n, n bits of Ω provide n bits of algorithmically incompressible
information.

In addition to recursive incompressibility, random reals are also characterised by
recursive unpredictability [Sj. Consider a 'predictive' program that takes a finite
initial segment of an infinite bitstring and returns a value indicating either 'the next
bit is 1', 'the next bit is 0' or 'no prediction'. If any such program is run on all finite
prefixes of the binary expansion of a random real and makes infinitely many
predictions, the limiting relative frequency of correct predictions approaches 1/2. In
other words when any program is used to predict infinitely many bits of a random
real, such as Ω, it does no better than random, even with information about all
the prior bits.

The power of this unpredictability can be seen when we compare it with the predictability
of τ. In this case, a predictive program can easily predict infinitely many
bits with no errors. This is because infinitely many bits of τ are 'easy' to compute.
For example, consider the halting behaviour of Turing machines: there are
infinitely many Turing machines which have no loops in their transition graphs
and thus cannot possibly diverge. When the predictive program is asked to predict
the n-th bit of τ, it can just check to see if the n-th program corresponds to
such a machine, returning 'the next bit is 1' if it does and 'no prediction' otherwise.^6

With its inherent incompressibility and unpredictability, Ω really does go beyond
the type of uncomputability present in a more typical non-recursive real such as
τ. However, its contrasting property of being an r.e. real makes it seem to be just
beyond our reach. In the next section, we will introduce Diophantine equations and
show how these can be used to bring uncomputability into the more classical field
of number theory. Then, in Section 3, we will show two ways of using Diophantine
equations to bring Ω and randomness to number theory: Chaitin's original method
and our new technique.



^5 Indeed, it has since been shown through the work of R. Solovay, C. S. Calude, P. Hertling,
B. Khoussainov, Y. Wang and T. A. Slaman that the only r.e. random reals are Ω's for different
programming languages. See Calude ^ for more details.

^6 From the definition of binary programs in algorithmic information theory, there must be a
recursive mapping between programs and Turing machines (or any such model).
2. Diophantine Equations and Hilbert's Tenth Problem

A Diophantine equation is a polynomial equation in which all of the coefficients
and variables take only positive integer values. Many natural phenomena with
discrete quantities are modelled well by Diophantine equations and they occur frequently
in number theory. It is often convenient to express a Diophantine equation
with all terms on the left hand side:

(2.1) D(x_1, \ldots, x_m) = 0

Here D is a polynomial in x_1, ..., x_m in which the coefficients can take both positive
and negative integer values.

The number of solutions for a Diophantine equation varies widely. For example,
3x_1 - 6 = 0 has one solution, while x_1 x_2 - 2 = 0 has two and x_1 x_2 - x_2 = 0 has
infinitely many. Some however, such as 2 - 5x_1 = 0, have no solutions at all. There
are many different methods for deciding whether Diophantine equations of certain
forms have solutions and determining what these solutions are, but there has been
a great desire for a single method that takes an arbitrary Diophantine equation
and determines whether or not it has solutions. In 1900, David Hilbert [5] gave
the problem of finding such a method as the tenth in his famous list of important
problems to be addressed by mathematicians in the 20th Century. Since then, the
task of finding this method has become known simply as Hilbert's Tenth Problem.
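
A bounded brute-force search illustrates how the number of solutions can vary. The four equations below are illustrative choices in the spirit of the examples above (one, two, many and no solutions in positive integers); the helper `solutions` and the search bound of 20 are assumptions of this sketch. A bounded search can confirm that solutions exist, but an empty result only shows that none lie inside the box.

```python
from itertools import product

def solutions(poly, num_unknowns, bound):
    """All solutions of poly(x_1, ..., x_m) = 0 with every unknown in 1..bound."""
    box = range(1, bound + 1)
    return [xs for xs in product(box, repeat=num_unknowns) if poly(*xs) == 0]

one = solutions(lambda x1: 3 * x1 - 6, 1, 20)
two = solutions(lambda x1, x2: x1 * x2 - 2, 2, 20)
many = solutions(lambda x1, x2: x1 * x2 - x2, 2, 20)
none = solutions(lambda x1: 2 - 5 * x1, 1, 20)

print(one, two, len(many), none)
```

For `many`, every pair with x_1 = 1 is a solution, so the count simply grows with the bound.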

Another area of research concerns families of Diophantine equations. A family
of Diophantine equations is a relation of the form:

(2.2) D(a_1, \ldots, a_n, x_1, \ldots, x_m) = 0

in which we distinguish between two types of variable. The variables x_1, ..., x_m are
called unknowns, while a_1, ..., a_n are called parameters. By assigning values to
each of the parameters (and treating them as constants), we pick out an individual
Diophantine equation from the family. For example, the family a_1 - 3x_1 = 0 consists
of the equations: 1 - 3x_1 = 0, 2 - 3x_1 = 0, 3 - 3x_1 = 0 and so on.

Each family of Diophantine equations is naturally associated with a certain set
of n-tuples of positive integers, 𝔇, in the following manner:

(2.3) (a_1, \ldots, a_n) \in \mathfrak{D} \iff \exists x_1, \ldots, x_m \; D(a_1, \ldots, a_n, x_1, \ldots, x_m) = 0

In other words, a tuple is in the set if the equation it corresponds to has a solution.
Such sets are said to be Diophantine or to have a Diophantine representation. For
example, the set of all multiples of 3 is Diophantine because it is represented by
the family a_1 - 3x_1 = 0.
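
The passage from a one-parameter family to the set it represents can be sketched with a bounded search. The helper `represented_set` and both bounds are assumptions of this sketch. The first family is the multiples-of-3 example from the text; the second, representing the composite numbers, is an additional standard example that does not come from the source.

```python
from itertools import product

def represented_set(family, num_unknowns, param_bound, search_bound):
    """Parameters a in 1..param_bound for which family(a, x_1, ..., x_m) = 0
    has a solution with every unknown in 1..search_bound."""
    box = range(1, search_bound + 1)
    return {
        a for a in range(1, param_bound + 1)
        if any(family(a, *xs) == 0 for xs in product(box, repeat=num_unknowns))
    }

# Family a_1 - 3 x_1 = 0: represents the multiples of 3.
multiples_of_three = represented_set(lambda a1, x1: a1 - 3 * x1, 1, 20, 20)

# Family a_1 - (x_1 + 1)(x_2 + 1) = 0: represents the composite numbers.
composites = represented_set(lambda a1, x1, x2: a1 - (x1 + 1) * (x2 + 1), 2, 20, 20)

print(sorted(multiples_of_three))
print(sorted(composites))
```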

Over the 1950's and 1960's, M. Davis, H. Putnam and J. Robinson established 
several important results regarding which sets are Diophantine. Their key result 
concerned a characterisation, not of Diophantine sets, but their close relation:
exponential Diophantine sets.

A family of exponential Diophantine equations is a relation of the form:

(2.4) D(a_1, \ldots, a_n, x_1, \ldots, x_m, 2^{x_1}, \ldots, 2^{x_m}) = 0

where D is once again a polynomial, but now some of its variables are exponential
functions of others. Davis, Putnam and Robinson [3] used this additional flexibility
to show that all r.e. sets are exponential Diophantine. It had long been known that
all exponential (and standard) Diophantine sets are r.e. because it is trivial to write
a program that searches for a solution to a given equation and halts if and only if it
finds one. Therefore, the new result meant that the exponential Diophantine sets
were precisely the r.e. sets.
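
The 'trivial' search program mentioned above might look like the following sketch. It enumerates tuples of positive integers in order of their largest entry and returns the first solution found; with `max_bound=None` it runs forever when there is no solution, which is the semi-computing behaviour the argument relies on. The cutoff `max_bound` and the example family a - 2^{x_1} = 0 (representing the powers of two) are assumptions added for illustration.

```python
from itertools import count, product

def search(equation, num_unknowns, max_bound=None):
    """Return a positive integer solution of equation(x_1, ..., x_m) == 0.

    Halts if and only if a solution exists, unless the artificial cutoff
    max_bound is given, in which case None is returned once it is passed.
    """
    for bound in count(1):
        if max_bound is not None and bound > max_bound:
            return None
        for xs in product(range(1, bound + 1), repeat=num_unknowns):
            # Only test tuples that are new at this stage.
            if max(xs) == bound and equation(*xs) == 0:
                return xs

def power_of_two(a):
    """Exponential Diophantine family with parameter a: a - 2^{x_1} = 0."""
    return lambda x1: a - 2 ** x1

print(search(power_of_two(32), 1))
print(search(power_of_two(12), 1, max_bound=50))
```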

In 1970, Yu. Matiyasevich completed the final step, proving that all exponential
Diophantine sets are also Diophantine and thus that the Diophantine sets are
exactly the r.e. sets, a result now known as the DPRM Theorem.

The DPRM Theorem provides an intimate link between Diophantine equations
and computability, reducing the task of determining whether a set has a Diophantine
representation to a matter of programming. For instance, there is a program
that takes a single input k and halts if and only if the k-th bit of τ is 1. Thus,
the set of positive integers that includes k if and only if the k-th program halts is
an r.e. set and via the DPRM Theorem, there is a family of Diophantine equations
with a parameter k, that has solutions if and only if the k-th program halts.

This family of equations provides an example of uncomputability in number
theory and shows that Hilbert's Tenth Problem must be recursively undecidable
because a program that finds whether arbitrary Diophantine equations have solutions
could be used to determine the bits of τ and thus to solve the halting
problem. Indeed, it was long known that the recursive undecidability of Hilbert's
Tenth Problem would follow immediately from the DPRM Theorem and this was
the main motivation for its proof, the Diophantine representations for all other
r.e. sets being largely a bonus.



3. Expressing Omega Through Diophantine Equations 

While the DPRM Theorem demonstrates the existence of τ and uncomputability
in number theory, it also denies the possibility of finding a similar family of Diophantine
equations expressing Ω and randomness. This is due to the fact discussed
in Section 1 that, while Ω is an r.e. real, its sequence of bits is not r.e. However, the
DPRM Theorem only prohibits a direct Diophantine representation of Ω and says
nothing about the more subtle properties of Diophantine equations in which these
bits could perhaps be encoded.

Chaitin takes such an approach. While there is no program of one variable,
k, that halts if and only if the k-th bit of Ω is 1, Chaitin provides a program, P,
that takes two variables, k and N, and computes Ω somewhat less directly. For a
given value of k, P can be thought of as making an infinite series of 'guesses' as to
the value of the k-th bit of Ω: when P is run on k and N, it gives the N-th guess
as to the k-th bit of Ω. What is impressive is that P gets infinitely many of these
guesses right and only finitely many wrong.

How does P do this? It simply computes the sequence {Ω_i} discussed in Section 1
until it gets to Ω_N and then returns the k-th bit of Ω_N. Just as {Ω_i} forms a
sequence of approximations to Ω, so the k-th bits of the Ω_i form a sequence of
approximations to the k-th bit of Ω.

Consider the k-th bit of Ω_i as i is increased. This bit could change
between 0 and 1 many times, but since {Ω_i} approaches Ω, it must eventually
remain fixed, at which point it must have the same value as the k-th bit of Ω.
Therefore, if the k-th bit of Ω is 1, the k-th bit of Ω_i must be 0 for only finitely
many values of i, and so P must return 0 for finitely many values of N and 1 for
infinitely many. On the other hand, if the k-th bit of Ω is 0, then the k-th bit of
Ω_i must be 1 for only a finite number of values of i and P must return 1 for
finitely many values of N and 0 for infinitely many. Either way, as N increases, the
output of P applied to k and N converges to the k-th bit of Ω.
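
The guessing behaviour of P can be imitated on a toy example. The four-program language and its halting times below are hypothetical, chosen so that one bit changes its value part-way through the sequence of approximations. The toy limit is the rational 3/4 = 0.110..., whereas the real Ω is irrational.

```python
from fractions import Fraction

# Hypothetical prefix-free language: program -> halting time (None = diverges).
TOY_PROGRAMS = {"0": 2, "10": None, "110": 5, "111": 1}

def omega_approx(i):
    """Omega_i for the toy language: programs halting within i steps."""
    return sum(
        (Fraction(1, 2 ** len(p)) for p, t in TOY_PROGRAMS.items()
         if t is not None and t <= i),
        Fraction(0),
    )

def kth_bit(x, k):
    """k-th bit after the binary point of a rational x with 0 <= x < 1."""
    return int(x * 2 ** k) % 2

def P(k, N):
    """The N-th guess at the k-th bit of the toy halting probability."""
    return kth_bit(omega_approx(N), k)

print([P(1, N) for N in range(1, 8)])
print([P(3, N) for N in range(1, 8)])
```

In this toy run the guesses for the third bit are wrong four times before settling on the correct value 0, while P itself never reports which of its guesses is the final one.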

It may seem as though this program is computing the bits of Ω but this is not
quite the case. P just computes the N-th 'guess' of the k-th bit. From the infinite
sequence of such guesses, the k-th bit could be determined but P does not and
cannot put the guesses together like that; it just returns one of them.

Since recursive functions are just a special type of r.e. function, we can apply
the DPRM Theorem and see that there must be a family of Diophantine equations

(3.1) \chi_1(k, N, x_1, \ldots, x_m) = 0

that has solutions for given values of k and N if and only if P returns 1 when
provided with these as input. For a given value of k, there are solutions for infinitely
many values of N if and only if the k-th bit of Ω is 1.

Thus, by using a more subtle property of the family of Diophantine equations,
Chaitin was able to show that algorithmic randomness occurs in number theory: as
k is varied, there is simply no recursive pattern to whether this family of equations
has solutions for finitely or infinitely many values of N.

By modifying Chaitin's method slightly, we can find a new way of expressing the
bits of Ω through a family of Diophantine equations [2]. Consider a new program,
Q, that also takes inputs k and N, and begins to compute the sequence {Ω_i}. For
each value of Ω_i, Q checks to see if it is greater than N/2^k, halting if this is so, and
continuing through the sequence otherwise. Since {Ω_i} approaches Ω from below,
we can see that Ω_i > N/2^k implies that Ω > N/2^k and conversely, if Ω > N/2^k there must
be some value of i such that Ω_i > N/2^k. Therefore, Q will halt on k and N if and
only if Ω > N/2^k. Alternatively, we could say that Q recursively enumerates the pairs
(k, N) such that Ω > N/2^k.
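
A sketch of Q on a toy example, assuming the threshold that Q compares against is N/2^k. The language is hypothetical: its programs are the strings 1^j 0 for j = 0, 1, 2, ..., with the j-th program halting after j + 1 steps when j is even and diverging when j is odd, so the toy halting probability is 2/3. The real Q would examine every Ω_i; the cutoff `max_i` is an artificial bound so that the demonstration terminates on inputs where Q diverges.

```python
from fractions import Fraction

def omega_approx(i):
    """Omega_i for the toy language: sum of 2^-(j+1) over even j with j + 1 <= i."""
    return sum((Fraction(1, 2 ** (j + 1)) for j in range(0, i, 2)), Fraction(0))

def Q_halts(k, N, max_i=64):
    """True if some Omega_i with i <= max_i exceeds N / 2^k.

    Without the cutoff, Q would loop forever whenever N / 2^k is at least
    the toy halting probability 2/3.
    """
    target = Fraction(N, 2 ** k)
    return any(omega_approx(i) > target for i in range(1, max_i + 1))

k = 3
print([N for N in range(1, 2 ** k) if Q_halts(k, N)])
```

For k = 3 the halting inputs are exactly those N with N/8 below 2/3, and there are five of them.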

Just as we could determine the k-th bit of Ω from the number of values of N
that make P return 1, so we can determine it from the number of values of N for
which Q halts. In what follows, we shall refer to these quantities as p_k and q_k
respectively.

Unlike p_k, q_k is always finite. Indeed, an upper bound is easily found. Since
Ω < 1, only values of k and N with N/2^k < 1 can satisfy N/2^k < Ω and
thus make Q halt. Since both k and N take only values from the positive integers
we also know that N/2^k > 0 and thus for a given k, there are fewer than 2^k values of
N for which Q halts and q_k ∈ {0, 1, ..., 2^k - 1}.

From the value of q_k, it is quite easy to derive the first k bits of Ω. Firstly, note
that q_k is equal to the largest value of N such that N/2^k < Ω, unless there is no
such N, in which case it equals 0. Either way, its value can be used to provide
a very tight bound on the value of Ω: q_k/2^k < Ω ≤ (q_k + 1)/2^k. Since Ω is irrational, we
can strengthen this to q_k/2^k < Ω < (q_k + 1)/2^k, which means that the first k bits of q_k/2^k are
exactly the first k bits of Ω.

This gives some nice results connecting q_k and Ω. The first k bits of q_k/2^k are just
the bits of q_k when written with enough leading zeros to make k digits in total.
Thus q_k, when written in this manner, provides the first k bits of Ω. Additionally,
we can see that q_k is odd if and only if the k-th bit of Ω is 1.
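
The relationship between q_k and the leading bits can be checked numerically, assuming q_k counts the positive integers N with N/2^k below the real in question. Since Ω itself cannot be computed, the sketch uses the hypothetical stand-in value 2/3 = 0.101010..., which is not irrational but, like an irrational, is never equal to any N/2^k.

```python
from fractions import Fraction

OMEGA_STAND_IN = Fraction(2, 3)   # hypothetical value, not the real Omega

def q(k, omega=OMEGA_STAND_IN):
    """Number of positive integers N with N / 2^k < omega."""
    return sum(1 for N in range(1, 2 ** k) if Fraction(N, 2 ** k) < omega)

def first_bits(k, omega=OMEGA_STAND_IN):
    """First k bits of omega's binary expansion, computed by repeated doubling."""
    bits, x = [], omega
    for _ in range(k):
        x *= 2
        bit = int(x)          # the integer part is the next bit
        bits.append(str(bit))
        x -= bit
    return "".join(bits)

def bits_from_q(k):
    """q_k written in binary, padded with leading zeros to k digits."""
    return format(q(k), "b").zfill(k)

for k in range(1, 7):
    print(k, q(k), bits_from_q(k), first_bits(k))
```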

Now that we know the power and flexibility of q_k, it is a simple matter to follow
Chaitin in bringing these results to number theory. The function computed by Q
is r.e. so, by the DPRM Theorem, there must be a family of Diophantine equations

(3.2) \chi_2(k, N, x_1, \ldots, x_m) = 0

that has a solution for specified values of k and N if and only if Q halts when given
these values as inputs. Therefore, for a particular value of k, this equation only has
solutions for values of N between 0 and 2^k - 1, with the number of such values, q_k,
being odd if and only if the k-th bit of Ω is 1.

This new family of Diophantine equations improves upon the original one in a
couple of ways. Whereas the first method expressed the bits of Ω in the fluctuations
between a finite and infinite number of values of N that give solutions, the second
keeps this value finite and bounded, with the bits of Ω expressed through the more
mundane property of parity. It is the fact that this quantity is always finite that
leads to many of the new features of this family of Diophantine equations. p_k is
infinite when the k-th bit of Ω is 1 and, since there is only one way in which it
can be infinite, it can provide no more than this one bit of information. On the
other hand, q_k can be odd (or even) in 2^{k-1} ways, which is enough to give k - 1
additional bits of information, allowing the first k bits of Ω to be determined.

The fact that q_k is always finite also provides a direct reduction of the problem of
determining the bits of Ω to Hilbert's Tenth Problem. To find the first k bits of Ω,
one need only determine for how many values of N the new family of Diophantine
equations has solutions. Since we know that there can be no solutions for values
of N greater than or equal to 2^k, we could determine the first k bits of Ω from
the solutions to 2^k instances of Hilbert's Tenth Problem. In fact, we can lower this
number by taking advantage of the fact that if there is a solution for a given value
of N then there are solutions for all lower values. All we need is to find the highest
value of N for which there is a solution and we can do this with a bisection search,
requiring the solution of only k instances of Hilbert's Tenth Problem.^7
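
The bisection search can be sketched as follows. No program can actually decide Hilbert's Tenth Problem, so `toy_oracle` is a hypothetical stand-in that answers as the true oracle would if the real in question were 2/3. The search keeps the candidate answer inside a range that halves with every oracle call.

```python
from fractions import Fraction

def largest_N(has_solution, k):
    """Largest N in 1..2^k - 1 with has_solution(k, N) true, or 0 if there is none.

    Relies on monotonicity: a solution for N implies solutions for all
    smaller N. Returns the answer together with the number of oracle calls.
    """
    lo, hi = 0, 2 ** k - 1        # invariant: the answer lies in lo..hi
    calls = 0
    while lo < hi:
        mid = (lo + hi + 1) // 2
        calls += 1
        if has_solution(k, mid):
            lo = mid
        else:
            hi = mid - 1
    return lo, calls

def toy_oracle(k, N):
    """Hypothetical oracle: 'the equation for (k, N) has a solution'."""
    return Fraction(N, 2 ** k) < Fraction(2, 3)

print(largest_N(toy_oracle, 5))
```

The range 0..2^k - 1 contains 2^k candidates and each call halves it exactly, so the search always uses k calls.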

Finally, the fact that q_k is always finite allows the generalisation of these results
from binary to any other base, b. If we replace all above references to 2^k with b^k we
get a new program, Q_b, with its associated family of Diophantine equations. For
this family, the value of q_k now gives us the first k digits of the base b expansion of
Ω: it is simply the base b representation of q_k with enough leading zeroes to give k
digits. The value of the k-th digit of Ω is simply q_k mod b.

Chaitin [S] did not stop with his Diophantine representation of Ω, but instead
moved to exponential Diophantine equations where his result could be presented 
more clearly. He made this move to take advantage of the theorem that all r.e. sets 
have singlefold exponential Diophantine representations, where a representation is 
singlefold if each equation in the family has at most one solution. 

We can denote the singlefold family of exponential Diophantine equations for
the program P by

(3.3) \chi_1(k, N, x_1, \ldots, x_{m'}) = 0

For a given k, this equation will have exactly one solution for each of infinitely many
values of N if the k-th bit of Ω is 1 and exactly one solution for each of finitely
many values of N if the k-th bit of Ω is 0. We can make use of this to express the
bits of Ω through a more intuitive property.



^7 For details see [7].
If we treat N in this equation as an unknown instead of a parameter, we get
a new (very similar) family of exponential Diophantine equations with only one
parameter

(3.4) \chi_1(k, x_0, x_1, \ldots, x_{m'}) = 0

Since the previous family was singlefold and N has become another unknown, there
will be exactly one solution to this single parameter family for each value of N that
gave a solution to the double parameter family. Thus, (3.4) has infinitely many
solutions if and only if the k-th bit of Ω is 1.

This same approach can be used with our method jSj. There is a two-parameter
singlefold family of exponential Diophantine equations for Q and this can be converted
to a single parameter family of exponential Diophantine equations

(3.5) \chi_2(k, x_0, x_1, \ldots, x_{m'}) = 0

with between 0 and 2^k - 1 solutions, the quantity being odd if and only if the k-th
bit of Ω is 1.

Finally, we have also shown [2] that both Chaitin's finitude-based method and
our parity-based method can be used to generate polynomials for Ω. For a given
family of Diophantine equations with two parameters,

(3.6) D(k, N, x_1, \ldots, x_m) = 0

we can construct a polynomial, W, where

(3.7) W(k, x_0, x_1, \ldots, x_m) = x_0 \left(1 - \left(D(k, x_0, x_1, \ldots, x_m)\right)^2\right)

Note that the parameter, N, is again treated as an unknown and thus denoted x_0.

If we restrict the values of the variables to positive integers then, for a given k,
the positive values taken by this polynomial are exactly the values of N for which
(3.6) has solutions. We can thus use this method on χ_1 = 0 and χ_2 = 0, generating polynomials
that express p_k and q_k in the number of distinct positive integer values they take on
for different values of k. We therefore have a polynomial whose number of distinct
positive integer values fluctuates from odd to even and back in an algorithmically
random manner as a parameter k is increased.
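
The construction in (3.7) can be tried out on a small example, reading the garbled exponent there as a square so that W = x_0 (1 - D^2). The family `D` below is a hypothetical toy, not one of the families for Ω: the equation N + x_1 - k = 0 has a solution in positive integers exactly when N < k. Whenever D is non-zero, D^2 is at least 1 and W is zero or negative, so the positive values of W are exactly the values of N that admit a solution.

```python
from itertools import product

def D(k, N, x1):
    """Hypothetical toy family: N + x1 - k = 0, solvable exactly when N < k."""
    return N + x1 - k

def W(k, x0, x1):
    """The polynomial x0 * (1 - D(k, x0, x1)^2), with N renamed x0."""
    return x0 * (1 - D(k, x0, x1) ** 2)

def positive_values(k, bound):
    """Distinct positive values of W with both unknowns in 1..bound."""
    box = range(1, bound + 1)
    values = (W(k, x0, x1) for x0, x1 in product(box, repeat=2))
    return {v for v in values if v > 0}

def solvable_N(k, bound):
    """Values of N in 1..bound for which D(k, N, x1) = 0 has a solution."""
    box = range(1, bound + 1)
    return {N for N in box if any(D(k, N, x1) == 0 for x1 in box)}

print(positive_values(5, 10), solvable_N(5, 10))
```

Counting the distinct positive values of W therefore counts the values of N with solutions, which is the quantity whose finiteness or parity carries the bit.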